CrystalGPU: Transparent and Efficient Utilization of GPU Power
Authors
Abstract
General-purpose computing on graphics processing units (GPGPU) has recently gained considerable attention in domains such as bioinformatics, databases, and distributed computing. GPGPU uses the GPU as a co-processor to which computationally intensive tasks are offloaded from the CPU. This study starts from the observation that a number of GPU features (such as overlapping communication and computation, short-lived buffer reuse, and harnessing multi-GPU systems) can be abstracted and reused across different GPGPU applications. This paper describes CrystalGPU, a modular framework that transparently enables applications to exploit a number of GPU optimizations. Our evaluation shows that CrystalGPU enables speedups of up to 16x on synthetic benchmarks while introducing negligible latency overhead.
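As an illustration of one optimization the abstract names, short-lived buffer reuse, the following is a minimal sketch and not CrystalGPU's actual API. The names `BufferPool`, `acquire`, `release`, and `run_tasks` are hypothetical, and a host-side `bytearray` stands in for what would be an expensive device or pinned-memory allocation in a real GPU setting.

```python
from collections import defaultdict


class BufferPool:
    """Hypothetical pool that recycles buffers instead of reallocating them."""

    def __init__(self):
        self._free = defaultdict(list)  # buffer size -> list of idle buffers
        self.allocations = 0            # number of fresh buffers created so far

    def acquire(self, size):
        """Return an idle buffer of this size, or allocate one if none is idle."""
        idle = self._free[size]
        if idle:
            return idle.pop()
        self.allocations += 1
        # Assumption: stand-in for a costly GPU or pinned-memory allocation.
        return bytearray(size)

    def release(self, buf):
        """Hand a buffer back so a later task of the same size can reuse it."""
        self._free[len(buf)].append(buf)


def run_tasks(pool, task_sizes):
    """Simulate short-lived tasks that each need one scratch buffer."""
    for size in task_sizes:
        buf = pool.acquire(size)
        buf[0] = 1  # placeholder for the real work done on the buffer
        pool.release(buf)
    return pool.allocations
```

In this sketch, a stream of tasks that all request the same buffer size triggers only one allocation, which is the effect a framework would aim for when hiding buffer management from the application.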
Similar resources
High Efficient Transparent TiO2 Nanotube Dye-Sensitized Solar Cells: Adhesion of TiO2 Nanotube Membrane to FTO by Two Different Methods
In order to fabricate transparent TiO2 nanotube dye-sensitized solar cells, anodically grown nanotube membranes are detached from the Ti substrate by a re-anodization method. The membranes are transferred onto FTO glass by two different methods. In the first, 100 mM Ti-isopropoxide is used to make TiO2 nanoparticles for adhering TiO2 nanotube membranes to FTO and ...
Efficient Resource Sharing Through GPU Virtualization on Accelerated High Performance Computing Systems
The High Performance Computing (HPC) field is witnessing a widespread adoption of Graphics Processing Units (GPUs) as co-processors for conventional homogeneous clusters. The adoption of the prevalent Single-Program Multiple-Data (SPMD) programming paradigm for GPU-based parallel processing brings in the challenge of resource underutilization, with the asymmetrical processor/co-processor distributio...
Efficient CPU-GPU cooperative computing for solving the subset-sum problem
A heterogeneous CPU-GPU system is a powerful way to accelerate compute-intensive applications, such as the subset-sum problem. Many parallel algorithms for solving the problem have been implemented on graphics processing units (GPUs). However, these GPU implementations may fail to fully utilize all the CPU cores and the GPU resources. When the GPU performs a computational task, only one CPU core is...
Efficient GPU Implementation of the Integral Histogram
The integral histogram for images is an efficient preprocessing method for speeding up diverse computer vision algorithms including object detection, appearance-based tracking, recognition and segmentation. Our proposed Graphics Processing Unit (GPU) implementation uses parallel prefix sums on row and column histograms in a cross-weave scan with high GPU utilization and communication-aware data...
Towards ParadisEO-MO-GPU: A Framework for GPU-Based Local Search Metaheuristics
This paper is a major step towards a pioneering software framework for the reusable design and implementation of parallel metaheuristics on Graphics Processing Units (GPU). The objective is to revisit the ParadisEO framework to allow its utilization on GPU accelerators. The focus is on local search metaheuristics and the parallel exploration of their neighborhood. The challenge is to make the G...
Journal title:
- CoRR
Volume abs/1005.1695
Pages -
Publication date 2009